Skip to content

[BugFix] LoopUnswitching: gate non-trivial else behind PassConfig#1816

Merged
LeiWang1999 merged 2 commits intotile-ai:mainfrom
LeiWang1999:fix/loop-unswitching-nontrivial-else-config
Feb 9, 2026
Merged

[BugFix] LoopUnswitching: gate non-trivial else behind PassConfig#1816
LeiWang1999 merged 2 commits intotile-ai:mainfrom
LeiWang1999:fix/loop-unswitching-nontrivial-else-config

Conversation

@LeiWang1999
Copy link
Member

@LeiWang1999 LeiWang1999 commented Feb 8, 2026

This PR makes LoopUnswitching more conservative by default, and adds an opt-in PassConfig to re-enable more aggressive hoisting.

Motivation:

  • In kernels with later synchronization injection (e.g. partial thread sync / fence proxy), unswitching a per-thread predicate or duplicating a non-trivial else branch can introduce control-flow that breaks correctness.

Changes:

  • Default: only unswitch when the else-version of the loop body is side-effect free (no-op / Evaluate-only).
  • Add PassConfig (Python: ) to allow unswitching even when the else-version is non-trivial.
  • Keep the existing safety guard that blocks unswitching predicates that depend on threadIdx.*.
  • Add unit tests covering default vs enabled behavior.

This keeps the canonical optimization -> while avoiding code-size blowups and unsafe control-flow splits by default.

Summary by CodeRabbit

  • New Features

    • Added a pass configuration to allow loop unswitching even when else-branches have side effects (disabled by default).
  • Improvements

    • Enhanced loop-unswitching to detect and respect per-thread predicates and to conservatively avoid unsafe hoisting unless explicitly enabled.
  • Tests

    • Added tests covering config-driven unswitching, non-trivial else-branch handling, and per-thread predicate behavior.

@github-actions
Copy link

github-actions bot commented Feb 8, 2026

👋 Hi! Thank you for contributing to the TileLang project.

Please remember to run pre-commit run --all-files in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.

We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 8, 2026

📝 Walkthrough

Walkthrough

Adds a new PassConfig option to allow unswitching when else-branches have side effects, introduces let-binding-aware variable usage checks and side-effect detection, and extends the loop-unswitching pass to track per-thread (threadIdx) variables and honor the new config when deciding to hoist conditionals.

Changes

Cohort / File(s) Summary
Configuration & Headers
src/op/builtin.h, src/op/builtin.cc, tilelang/transform/pass_config.py
Added a new pass-config key/attribute tl.loop_unswitching_allow_non_trivial_else / TL_LOOP_UNSWITCHING_ALLOW_NON_TRIVIAL_ELSE and registered it in the C++ pass config.
Loop Unswitching Pass
src/transform/loop_unswitching.cc
Introduced UsesVarsThroughLetBindings(...), IsSideEffectFreeStmt(...); extended IsLoopInvariant(...) and HoistableIfFinder to accept disallowed-vars; added LoopUnswitcher(bool allow_non_trivial_else) with thread_idx_vars_in_scope_ tracking; changed ApplyLoopUnswitching signature and pass hook to pass the new flag; skip unswitching for per-thread predicates or non-trivial else unless enabled.
Tests
testing/python/transform/test_tilelang_transform_loop_unswitching.py
Added _check_with_config() helper and new tests exercising thread-local predicate rejection and config-driven hoisting of non-trivial else/other statements.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

Possibly related PRs

Suggested reviewers

  • chengyupku

Poem

🐰 I hop where loops and let-bindings play,
ThreadIdx warns me: "Don't unswitch today."
A config nudge, a careful, clever tweak,
Now hoists are safe — the rabbit gives a squeak! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 36.84% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: making loop unswitching more conservative by default and gating non-trivial else hoisting behind a PassConfig option.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

No actionable comments were generated in the recent review. 🎉

🧹 Recent nitpick comments
src/transform/loop_unswitching.cc (3)

526-526: thread_idx_vars_in_scope_ should be private.

This is an internal implementation detail of the mutator. Being public allows accidental access from outside the class.

Move to private section
 class LoopUnswitcher : public StmtExprMutator {
 public:
   explicit LoopUnswitcher(bool allow_non_trivial_else)
       : allow_non_trivial_else_(allow_non_trivial_else) {}
 
-  std::unordered_set<const VarNode *> thread_idx_vars_in_scope_;
-
   Stmt VisitStmt_(const ForNode *op) final {
   ...
 
 private:
   bool allow_non_trivial_else_{false};
+  std::unordered_set<const VarNode *> thread_idx_vars_in_scope_;
 };

528-629: Consider an RAII guard for thread_idx_vars_in_scope_ cleanup to avoid duplication across four early-return sites.

The pushed_thread_idx / erase pair is repeated at lines 559, 579, 601, and 625. A scope guard would consolidate this and prevent omission in future edits.

Example with a simple scope guard
   Stmt VisitStmt_(const ForNode *op) final {
     bool pushed_thread_idx = false;
     if (op->thread_binding.defined()) {
       String thread_tag = op->thread_binding.value()->thread_tag;
       if (thread_tag == "threadIdx.x" || thread_tag == "threadIdx.y" ||
           thread_tag == "threadIdx.z") {
         thread_idx_vars_in_scope_.insert(op->loop_var.get());
         pushed_thread_idx = true;
       }
     }
+    // RAII cleanup for thread_idx tracking
+    auto cleanup = [&]() {
+      if (pushed_thread_idx) {
+        thread_idx_vars_in_scope_.erase(op->loop_var.get());
+      }
+    };
+    struct ScopeGuard {
+      std::function<void()> fn;
+      ~ScopeGuard() { fn(); }
+    } guard{cleanup};

Then remove the four individual if (pushed_thread_idx) { ... erase ... } blocks and the manual cleanup before each return.


296-346: AttrStmtNode is not handled — may skip valid optimizations.

IsSideEffectFreeStmt returns false for any AttrStmtNode. Since HoistableIfFinder will descend into AttrStmt bodies and find loop-invariant ifs, but IfBranchReplacer preserves the AttrStmt wrapper when replacing the matched if with Evaluate(0), the resulting else-branch will contain AttrStmt(..., Evaluate(0)) which fails the side-effect-free check and blocks unswitching.

While the conservative approach is intentional, AttrStmt is a common annotation in TVM TIR for thread_extent, storage_scope, and async regions. Other transform passes in the codebase routinely check AttrStmtNode; handling it would enable optimizations in those patterns.

Consider adding:

   if (const auto *op = stmt.as<ForNode>()) {
     if (SideEffect(op->min) > CallEffectKind::kReadState ||
         SideEffect(op->extent) > CallEffectKind::kReadState) {
       return false;
     }
     return IsSideEffectFreeStmt(op->body);
   }
 
+  if (const auto *op = stmt.as<AttrStmtNode>()) {
+    if (SideEffect(op->value) > CallEffectKind::kReadState) {
+      return false;
+    }
+    return IsSideEffectFreeStmt(op->body);
+  }
+
   // Conservatively treat all other statements as side-effecting.
   return false;

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@LeiWang1999 LeiWang1999 merged commit e9d0569 into tile-ai:main Feb 9, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant